Locative trigrams in Northern Sotho, preceded by analyses of formative bigrams
Abstract
In Northern Sotho, one of the strategies for expressing locality makes use of locative particle groups, that is, complements preceded by any of the so-called locative particles ka, kua, mo, ga, or go. Current linguistic descriptions shy away from cases where sequences of such particles are employed. In this article these sequences are termed “locative n-grams” and are studied for the first time. It will be shown that, synchronically, just a handful of locative trigrams and bigrams actually occur in a relatively large corpus. An in-depth study of the examples makes it possible to take stock of the existing structures, provides data on the distribution of all the n-grams, and hints at their semantic content as well as the restrictions on the nature of the complements. To clarify the latter two aspects, a diachronic approach is often pursued. As a by-product, the study of the higher-order n-grams also brings to light hitherto overlooked features of the unigrams. The main research question driving this investigation was thus whether higher-order locative n-grams exist in Northern Sotho. As the answer proved to be positive, the major objective became to describe the attested structures in detail by drawing on corpus data.
Similar resources
Scalable Trigram Backoff Language Models
When a trigram backoff language model is created from a large body of text, trigrams and bigrams that occur few times in the training text are often excluded from the model in order to decrease the model size. Generally, the elimination of n-grams with very low counts is believed to not significantly affect model performance. This project investigates the degradation of a trigram backoff model’...
Scalable backoff language models
When a trigram backoff language model is created from a large body of text, trigrams and bigrams that occur few times in the training text are often excluded from the model in order to decrease the model size. Generally, the elimination of n-grams with very low counts is believed to not significantly affect model performance. This project investigates the degradation of a trigram backoff model’...
Morphosyntactic discrepancies in representing the adjective equivalent in African WordNet with reference to Northern Sotho
This paper aims to highlight morphosyntactic discrepancies encountered in representing the adjective equivalent in African WordNet, with reference to Northern Sotho. Northern Sotho is an agglutinating language with rich and productive morphology. The language also features a disjunctive orthographic system. The orthography determines the attachment selection of morphemes. The immediate issue, i...
Finite state tokenisation of an orthographical disjunctive agglutinative language: The verbal segment of Northern Sotho
Tokenisation is an important first pre-processing step required to adequately test finite-state morphological analysers. In agglutinative languages each morpheme is concatenatively added on to form a complete morphological structure. Disjunctive agglutinative languages like Northern Sotho write these morphemes, for certain morphological categories only, as separate words separated by spaces or ...
Development of prototype text-to-speech systems for northern sotho
Two text-to-speech synthesis systems were developed for one of the eleven official languages of South Africa, viz. Northern Sotho. A diphone synthesis system, based on extraction of diphones from nonsense words, was constructed. A cluster unit selection synthesis system, based on recordings of sentences containing a selection of most common words in Northern Sotho, was also built. The Festival ...